PATENT

This application is submitted in the name of the following inventor(s): 


Inventor            Citizenship      Residence City and State

Mark MUHLESTEIN     United States    Tucson, Arizona

The assignee is Network Appliance, Inc., a California corporation having an
office at 495 East Java Drive, Sunnyvale, CA 94089.

Title of the Invention

Decentralized Appliance Virus Scanning

Background of the Invention

1. Field of the Invention

This invention relates to virus scanning in a networked environment.

2. Related Art

Computer networking and the Internet in particular offer end users unprecedented
access to information of all types on a global basis. Access to information can be as
simple as connecting some type of computing device using a standard phone line to a
network. With the proliferation of wireless communication, users can now access
computer networks from practically anywhere.

Connectivity of this magnitude has magnified the impact of computer viruses. Viruses
such as "Melissa" and "I love you" had a devastating impact on computer systems
worldwide. Costs for dealing with viruses are often measured in millions and tens of
millions of dollars. Recently it was shown that hand-held computing devices are also
susceptible to viruses.

Virus protection software can be very effective in dealing with viruses, and virus
protection software is widely available for general computing devices such as personal
computers. There are, however, problems unique to specialized computing devices, such
as filers (devices dedicated to storage and retrieval of data). Off-the-shelf virus
protection software will not run on a specialized computing device unless it is modified
to do so, and it can be very expensive to rewrite software to work on another platform.

A first known method is to scan for viruses at the data source. When the data is being
provided by a specialized computing device, the specialized computing device must be
scanned. Device-specific virus protection software must be written in order to scan the
files on the device.

While this first known method is effective in scanning files for viruses, it suffers
from several drawbacks. First, a company with a specialized computing device would have
to dedicate considerable resources to creating virus protection software and maintaining
up-to-date data files that protect against new viruses as they emerge.


Additionally, although a manufacturer of a specialized computing device could enlist the
assistance of a company that creates mainstream virus protection software to write the
custom application and become a licensee, this would create other problems, such as
reliance on the chosen vendor of the anti-virus software, compatibility issues when
hardware upgrades are effected, and a large financial expense.

A second known method is to have the end user run anti-virus software on their client
device. Anti-virus software packages are offered by such companies as McAfee and
Symantec. These programs are loaded during the boot stage of a computer and work as a
background job monitoring memory and files as they are opened and saved.

While this second known method is effective at intercepting and protecting the client
device from infection, it suffers from several drawbacks. It places the burden of
detection at the last possible link in the chain. If for any reason the virus is not
detected prior to reaching the end user, it is now at the computing device where it will
do the most damage (corrupting files and spreading to other computer users and systems).

It is much better to sanitize a file at the source from where it may be delivered to
millions of end users rather than deliver the file and hope that the end user is
prepared to deal with the file in the event the file is infected. End users often have
older versions of anti-virus software and/or have not updated the data files that ensure
the software is able to protect against newly discovered viruses, thus making detection
at the point of mass distribution even more critical.

Also, hand-held computing devices are susceptible to viruses, but they are poorly
equipped to handle them. Generally, hand-held computing devices have very limited memory
resources compared to desktop systems. Dedicating a portion of these resources to virus
protection severely limits the ability of the hand-held device to perform effectively.
Reliable virus scanning at the information source is the most efficient and effective
method.

Protecting against viruses is a constant battle. New viruses are created every day,
requiring virus protection software manufacturers to come up with new data files
(solution algorithms used by anti-virus applications). By providing protection at the
source of the file, viruses can be eliminated more efficiently and effectively.

Security of data in general is important. Equally important is the trust of the end
user. This comes from the reputation that precedes a company, and companies that engage
in web commerce often live and die by their reputation. Just as an end user trusts that
the credit card number they have just disclosed for a web-based sales transaction is
secure, they want the files they receive to be just as secure.

Accordingly, it would be desirable to provide a technique for scanning specialized
computing devices for viruses and other malicious or unwanted content that may need to
be changed, deleted, or otherwise modified.

Summary of the Invention

The invention provides a method and system for scanning specialized computing devices
(such as filers) for viruses. In a preferred embodiment, a filer is connected to one or
more supplementary computing devices that scan requested files to ensure they are virus
free prior to delivery to end users. When an end user requests a file from the filer the
following steps occur: First, the filer determines whether the file requested must be
scanned before delivery to the end user. Second, the filer opens a channel to one of the
external computing devices and sends the filename. Third, the external computing device
opens the file and scans it. Fourth, the external computing device notifies the filer of
the status of the file scan operation. Fifth, the filer sends the file to the end user
provided the status indicates it may do so.
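The five steps of the summary can be illustrated with a minimal Python sketch. It is not taken from the application: the names `Filer`, `ScanStatus`, `toy_scanner`, and `known_clean` are hypothetical, the "channel" is modeled as a plain function call, and the byte-string check is only a placeholder for a real virus scanner.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass
class ScanStatus:
    """Hypothetical status message an external device returns to the filer."""
    filename: str
    safe_to_send: bool


# A scanner receives a filename plus a way to open that file on the filer.
Scanner = Callable[[str, Callable[[str], bytes]], ScanStatus]


def toy_scanner(filename: str, open_file: Callable[[str], bytes]) -> ScanStatus:
    """Stand-in for an external computing device (third and fourth steps)."""
    data = open_file(filename)                # third: open the file and scan it
    infected = b"FAKE-SIGNATURE" in data      # placeholder check, not real detection
    return ScanStatus(filename, safe_to_send=not infected)  # fourth: report status


class Filer:
    def __init__(self, files: Dict[str, bytes], scanner: Scanner) -> None:
        self.files = files
        self.scanner = scanner                # stands in for the channel to the device
        self.known_clean: Set[str] = set()

    def must_scan(self, filename: str) -> bool:
        return filename not in self.known_clean

    def handle_request(self, filename: str) -> Optional[bytes]:
        if self.must_scan(filename):                                  # first
            status = self.scanner(filename, self.files.__getitem__)  # second
            if not status.safe_to_send:
                return None                   # fifth: withhold the file
            self.known_clean.add(filename)
        return self.files[filename]           # fifth: deliver the file
```

In this sketch the filer never inspects file contents itself; it only decides whether a scan is needed and acts on the reported status, which mirrors the division of labor described above.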

This system is very efficient and effective, as a file needs to be scanned only one time
for a virus unless the file has been modified or new data files that protect against new
viruses have been added. Scan reports for files that have been scanned may be stored in
one or more of the external computing devices, in one or more filers, and some portion
of a scan report may be delivered to end users.

In alternative embodiments of the invention one or more of the external computing
devices may be running other supplementary applications, such as file compression and
encryption, independently or in some combination.

Brief Description of the Drawings

Figure 1 shows a block diagram of a system for decentralized appliance virus scanning.

Figure 2 shows a process flow diagram for a system for decentralized virus scanning.

Detailed Description of the Preferred Embodiment

In the following description, a preferred embodiment of the invention is described with
regard to preferred process steps and data structures. Those skilled in the art would
recognize after perusal of this application that embodiments of the invention can be
implemented using one or more general purpose processors or special purpose processors
or other circuits adapted to particular process steps and data structures described
herein, and that implementation of the process steps and data structures described
herein would not require undue experimentation or further invention.

Lexicography

The following terms refer or relate to aspects of the invention as described below. The
descriptions of general meanings of these terms are not intended to be limiting, only
illustrative.


• virus — in general, a manmade program or piece of code that is loaded onto a computer
without the computer user's knowledge and runs against their wishes. Most viruses can
also replicate themselves, and the more dangerous types of viruses are capable of
transmitting themselves across networks and bypassing security systems.

• client and server — in general, these terms refer to a relationship between two de- 
vices, particularly to their relationship as client and server, not necessarily to any 
particular physical devices. 


For example, but without limitation, a particular client device in a first relationship 
with a first server device, can serve as a server device in a second relationship with 
a second client device. In a preferred embodiment, there are generally a relatively 
small number of server devices servicing a relatively larger number of client de- 
vices. 


Express Mailing EL524780446US



103.1056.01 


• client device and server device — in general, these terms refer to devices taking on
the role of a client device or a server device in a client-server relationship (such as
an HTTP web client and web server). There is no particular requirement that any client
devices or server devices must be individual physical devices. They can each be a single
device, a set of cooperating devices, a portion of a device, or some combination
thereof.

For example, but without limitation, the client device and the server device in a
client-server relation can actually be the same physical device, with a first set of
software elements serving to perform client functions and a second set of software
elements serving to perform server functions.

• web client and web server (or web site) — as used herein the terms "web client" and
"web server" (or "web site") refer to any combination of devices or software taking on
the role of a web client or a web server in a client-server environment in the internet,
the world wide web, or an equivalent or extension thereof. There is no particular
requirement that web clients must be individual devices. They can each be a single
device, a set of cooperating devices, a portion of a device, or some combination thereof
(such as for example a device providing web server services that acts as an agent of the
user).

As noted above, these descriptions of general meanings of these terms are not intended
to be limiting, only illustrative. Other and further applications of the invention,
including extensions of these terms and concepts, would be clear to those of ordinary
skill in the art after perusing this application. These other and further applications
are part of the scope and spirit of the invention, and would be clear to those of
ordinary skill in the art, without further invention or undue experimentation.


System Elements

Figure 1 shows a block diagram of a system for decentralized appliance virus scanning.

A system 100 includes a client device 110 associated with a user 111, a communications
network 120, a filer 130, and a processing cluster 140.

The client device 110 includes a processor, a main memory, and software for executing
instructions (not shown, but understood by one skilled in the art). Although the client
device 110 and cluster 140 are shown as separate devices there is no requirement that
they be physically separate.


In a preferred embodiment, the communications network 120 includes the Internet. In
alternative embodiments, the communications network 120 may include alternative forms of
communication, such as an intranet, extranet, virtual private network, direct
communication links, or some other combination or conjunction thereof.

A communications link 115 operates to couple the client device 110 to the
communications network 120.

The filer 130 includes a processor, a main memory, software for executing instructions
(not shown, but understood by one skilled in the art), and a mass storage 131. Although
the client device 110 and filer 130 are shown as separate devices there is no
requirement that they be separate devices. The filer 130 is connected to the
communications network 120.

The mass storage 131 includes at least one file 133 that is capable of being requested
by a client device 110.

The processing cluster 140 includes one or more cluster devices 141, each including a
processor, a main memory, software for executing instructions, and a mass storage (not
shown but understood by one skilled in the art). Although the filer 130 and the
processing cluster 140 are shown as separate devices there is no requirement that they
be separate devices.

In a preferred embodiment the processing cluster 140 is a plurality of personal
computers in an interconnected cluster capable of intercommunication and direct
communication with the filer 130.

The cluster link 135 operates to connect the processing cluster 140 to the filer 130.
The cluster link 135 may include non-uniform memory access (NUMA), or communication via
an intranet, extranet, virtual private network, direct communication links, or some
other combination or conjunction thereof.

Method of Operation

Figure 2 shows a process flow diagram for a system for decentralized appliance virus
scanning.

A method 200 includes a set of flow points and a set of steps. The system 100 performs
the method 200. Although the method 200 is described serially, the steps of the method
200 can be performed by separate elements in conjunction or in parallel, whether
asynchronously, in a pipelined manner, or otherwise. There is no particular requirement
that the method 200 be performed in the same order in which this description lists the
steps, except where so indicated.

At a flow point 200, the system 100 is ready to begin performing the method 200.

At a step 201, a user 111 utilizes the client device 110 to initiate a request for a
file 133. The request is transmitted to the filer 130 via the communications network
120. In a preferred embodiment the filer 130 is performing file retrieval and storage at
the direction of a web server (not shown but understood by one skilled in the art).

At a step 203, the filer 130 receives the request for the file 133 and sends the file ID
and path of the file 133 to the processing cluster 140, where it is received by one of
the cluster devices 141.

At a step 205, the cluster device 141 uses the file ID and path to open the 
file 133 in the mass storage 131 of the filer 130. 

At a step 207, the cluster device 141 scans the file 133 for viruses. In a preferred
embodiment, files are tasked to the processing cluster 140 in a round robin fashion. In
alternative embodiments files may be processed individually by a cluster device 141, by
multiple cluster devices 141 simultaneously, or some combination thereof. Load balancing
may be used to ensure maximum efficiency of processing within the processing cluster
140.

There are several vendors offering virus protection software for personal computers,
thus the operator of the filer 130 may choose whatever product they would like to use.
They may even use combinations of vendors' products in the processing cluster 140. In an
alternative embodiment of the invention, continual scanning of every file 133 on the
filer 130 may take place.
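The round robin tasking mentioned for step 207 can be sketched as follows. This is an illustrative assumption rather than the application's implementation: the class name `RoundRobinDispatcher` and the device names are hypothetical, and the sketch only picks a cluster device for each file without performing any scan.

```python
from itertools import cycle
from typing import Iterator, Sequence


class RoundRobinDispatcher:
    """Assigns each incoming file to the next cluster device in turn."""

    def __init__(self, device_names: Sequence[str]) -> None:
        if not device_names:
            raise ValueError("at least one cluster device is required")
        # cycle() repeats the device list endlessly in its original order.
        self._devices: Iterator[str] = cycle(list(device_names))

    def assign(self, file_path: str) -> str:
        """Return the name of the device that should scan file_path."""
        return next(self._devices)


dispatcher = RoundRobinDispatcher(["pc1", "pc2", "pc3"])
assignments = [dispatcher.assign(path) for path in ["/a", "/b", "/c", "/d", "/e"]]
```

Round robin ignores how busy each device is; a load-balancing variant would instead choose the device with the shortest queue.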



The processing cluster 140 is highly scalable. The price of personal computers is low
compared to dedicated devices, such as filers, therefore this configuration is very
desirable. Additionally, a cluster configuration offers redundant systems availability
in case a cluster device 141 fails; failover and takeover is also possible within the
processing cluster 140.
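One way to picture failover within the cluster is to retry a scan on another device when the first one is unreachable. The sketch below is hypothetical: `scan_with_failover` is an invented helper, each device is modeled as a (name, callable) pair, and an unreachable device is assumed to raise Python's built-in `ConnectionError`.

```python
from typing import Callable, List, Sequence, Tuple

# Each cluster device is modeled as a name and a scan function (an assumption).
Device = Tuple[str, Callable[[str], str]]


def scan_with_failover(file_path: str, devices: Sequence[Device]) -> Tuple[str, str]:
    """Try each device in order and return (device name, scan report)."""
    failures: List[str] = []
    for name, scan in devices:
        try:
            return name, scan(file_path)
        except ConnectionError:
            failures.append(name)   # device is down, fall through to the next one
    raise RuntimeError(f"no cluster device available, tried: {failures}")


def offline_device(file_path: str) -> str:
    raise ConnectionError("device unreachable")


def healthy_device(file_path: str) -> str:
    return "clean"


result = scan_with_failover("/vol/file", [("pc1", offline_device), ("pc2", healthy_device)])
```

Here `result` identifies the device that actually produced the report, so the filer could record which cluster device took over.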

At a step 209, the cluster device 141 transmits a scan report to the filer 130. The scan
report primarily reports whether the file is safe to send. Further information may be
saved to a database for statistical purposes (for example, how many files have been
identified as infected, and whether the virus software was able to sanitize the file or
the file was deleted). The database may be consulted to determine whether the file 133
needs to be scanned before delivery upon receipt of a subsequent request. If the file
133 has not changed since it was last scanned and no additional virus data files have
been added to the processing cluster, the file 133 probably does not need to be scanned.
This means the file 133 can be delivered more quickly.
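The rescan decision described for step 209 can be sketched as a small lookup against a stored record. The record layout is an assumption made for illustration: `ScanRecord`, `needs_scan`, the use of a modification time to detect changes, and an integer version for the virus data files are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScanRecord:
    """Hypothetical database row describing the last scan of a file."""
    file_mtime: float          # modification time of the file when it was scanned
    data_file_version: int     # version of the virus data files used for that scan
    safe_to_send: bool         # result reported by the cluster device


def needs_scan(record: Optional[ScanRecord],
               current_mtime: float,
               current_data_file_version: int) -> bool:
    """Return True when the file must be scanned again before delivery."""
    if record is None:
        return True                                   # never scanned
    if current_mtime != record.file_mtime:
        return True                                   # file changed since last scan
    if current_data_file_version > record.data_file_version:
        return True                                   # newer virus data files exist
    return False                                      # earlier result still applies


last = ScanRecord(file_mtime=1000.0, data_file_version=7, safe_to_send=True)
```

When `needs_scan` returns False, the filer can rely on the stored `safe_to_send` value and skip the round trip to the processing cluster.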

Other intermediary applications may also run separately, in conjunction with other
applications, or in some combination thereof within the processing cluster 140.
Compression and encryption utilities are some examples of these applications. These
types of applications, including virus scanning, can be very CPU intensive, thus
outsourcing can yield better performance by allowing a dedicated device like a filer to
do what it does best and farm out other tasks to the processing cluster 140.

At a step 211, the filer 130 transmits or does not transmit the file 133 to the client
device 110 based on its availability as reported following the scan by the processing
cluster 140. Some portion of the scan report may also be transmitted to the user.

At this step, a request for a file 133 has been received, the request has been
processed, and if possible a file 133 has been delivered. The process may be repeated at
step 201 for subsequent requests.

Generality of the Invention

The invention has wide applicability and generality to other aspects of processing
requests for files.

The invention is applicable to one or more of, or some combination of, circumstances
such as those involving:

• file compression;

• file encryption; and

• general outsourcing of CPU intensive tasks from dedicated appliances to general
purpose computers.

Alternative Embodiments 

Although preferred embodiments are disclosed herein, many variations are 
possible which remain within the concept, scope, and spirit of the invention, and these 
variations would become clear to those skilled in the art after perusal of this application. 